-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [ ] Responsive to communication(s) filed on ____.

2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

4) [X] Claim(s) 1-33 is/are pending in the application.

5) [ ] Claim(s) ____ is/are allowed.

6) [X] Claim(s) 1-7, 10-22, and 25-33 is/are rejected.

DETAILED ACTION



Specification 

1. The arrangement of the disclosed application does not conform with 37 CFR 1.77(b).
Section headings are boldfaced throughout the disclosed specification. Section headings should 
not be boldfaced. Appropriate corrections are required according to the guidelines provided 
below: 



2. The following guidelines illustrate the preferred layout for the specification of a utility 
application. These guidelines are suggested for the applicant's use. 

Arrangement of the Specification 

As provided in 37 CFR 1.77(b), the specification of a utility application should include 
the following sections in order. Each of the lettered items should appear in upper case, without 
underlining or bold type, as a section heading. If no text follows the section heading, the phrase 
"Not Applicable" should follow the section heading: 

(a) TITLE OF THE INVENTION. 

(b) CROSS-REFERENCE TO RELATED APPLICATIONS. 

(c) STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT.

(d) INCORPORATION-BY-REFERENCE OF MATERIAL SUBMITTED ON A COMPACT DISC
(See 37 CFR 1.52(e)(5) and MPEP 608.05. Computer program
listings (37 CFR 1.96(c)), "Sequence Listings" (37 CFR 1.821(c)), and tables
having more than 50 pages of text are permitted to be submitted on compact
discs.) or
REFERENCE TO A "MICROFICHE APPENDIX" (See MPEP § 608.05(a).
"Microfiche Appendices" were accepted by the Office until March 1, 2001.)

(e) BACKGROUND OF THE INVENTION. 

(1) Field of the Invention. 

(2) Description of Related Art including information disclosed under 37 CFR 1.97 
and 1.98. 

(f) BRIEF SUMMARY OF THE INVENTION. 

(g) BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING(S). 

(h) DETAILED DESCRIPTION OF THE INVENTION. 

(i) CLAIM OR CLAIMS (commencing on a separate sheet).

(j) ABSTRACT OF THE DISCLOSURE (commencing on a separate sheet). 

(k) SEQUENCE LISTING (See MPEP § 2424 and 37 CFR 1.821-1.825. A "Sequence 
Listing" is required on paper if the application discloses a nucleotide or amino 
acid sequence as defined in 37 CFR 1.821(a) and if the required "Sequence 
Listing" is not submitted as an electronic document on compact disc). 



Claim Objections

3. Claims 15 and 21-22 are objected to because of the following informalities:

Claims 15 and 21 are objected to because they end with two periods. Claims should start
with a capital letter and end with a period (See MPEP 608.01(m)).

Claim 22 is objected to because it does not end with a period. Claims should start with a
capital letter and end with a period (See MPEP 608.01(m)).

Appropriate correction is required.



Claim Rejections - 35 USC §102

4. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the
basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(e) the invention was described in (1) an application for patent, published under section 122(b), by another filed
in the United States before the invention by the applicant for patent or (2) a patent granted on an application for
patent by another filed in the United States before the invention by the applicant for patent, except that an
international application filed under the treaty defined in section 351(a) shall have the effects for purposes of this
subsection of an application filed in the United States only if the international application designated the United
States and was published under Article 21(2) of such treaty in the English language.

5. Claims 1-3, 7, 16-18, 22, and 31 are rejected under 35 U.S.C. 102(e) as being anticipated
by Gomes et al. (U.S. patent No. 6,615,209 B1).

As to claim 1, Gomes et al. teaches a method for processing data representing documents, 
comprising: 

for individual documents of a set of documents, executing a software program to obtain a 
list of terms found in each document (see column 10, line 42 through column 12, line 58); 

comparing the list of terms for a first document to the list of terms for a second document 
(see column 12, line 59 through column 13, line 41); and 

declaring the first document to be substantially identical to, or substantially similar to, the 
second document if some predetermined number of terms are found in each of the lists of the 
first document and the second document (see column 8, lines 37-60). 

As to claims 2 and 17, Gomes et al. teaches wherein if the predetermined number
is about 90% of the terms or greater the first document is declared to be substantially identical to
the second document (see column 12, line 59 through column 13, line 41).
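
For illustration only, the comparison recited in claims 1, 2, and 17 can be sketched in Python as follows. This is a minimal sketch and is not taken from Gomes et al. or from the application: the function names, the word-level tokenization, and the choice to measure the shared terms against the shorter of the two term lists are all assumptions made for the example. The 0.9 default corresponds to the "about 90%" figure recited in claims 2 and 17.

```python
import re


def term_list(text):
    """Return the set of terms found in a document.

    Assumption: a "term" is a lowercase run of letters or digits.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shared_fraction(terms_a, terms_b):
    """Fraction of the shorter term list that also appears in the other list."""
    if not terms_a or not terms_b:
        return 0.0
    smaller = min(len(terms_a), len(terms_b))
    return len(terms_a & terms_b) / smaller


def substantially_identical(doc_a, doc_b, threshold=0.9):
    """Declare two documents substantially identical when the shared
    fraction of terms meets the (hypothetical) threshold."""
    return shared_fraction(term_list(doc_a), term_list(doc_b)) >= threshold
```

A lower threshold could be passed in the same way to model a "substantially similar" determination; the claims quoted above do not fix such a value.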



As to claims 3 and 18, Gomes et al. teaches wherein the set of documents is obtained in 
response to a search query made to a data communications network (see column 5, lines 43-65). 

As to claims 7 and 22, Gomes et al. teaches wherein the step of executing a software
program assigns to each term a collection-level importance ranking or Information Quotient 
(IQ), and wherein the IQ is considered during the step of comparing (see column 13, lines 1-22). 

As to claim 16, Gomes et al. teaches a system for processing data representing documents 
comprising, for individual documents of a set of documents, a processor for executing a software 
program to obtain a list of terms found in each document (see column 10, line 42 through column 
12, line 58) and for comparing the list of terms for a first document to the list of terms for a 
second document (see column 12, line 59 through column 13, line 41), said processor being 
operable for declaring the first document to be substantially identical to, or substantially similar 
to, the second document if some predetermined number of terms are found in each of the lists of 
the first document and the second document (see column 8, lines 37-60). 

As to claim 31, Gomes et al. teaches a computer program recorded on a computer-readable
media, said computer program comprising instructions for directing a data processor to
process data representing documents by, for individual documents of a set of documents, 
obtaining a list of terms found in each document (see column 10, line 42 through column 12, line 
58); comparing the list of terms for a first document to the list of terms for a second document 
(see column 12, line 59 through column 13, line 41); and declaring the first document to be 
substantially identical to, or substantially similar to, the second document if some predetermined 
number of terms are found in each of the lists of the first document and the second document 
(see column 8, lines 37-60). 

6. Claims 10, 12-13, 25, 27-28, and 32 are rejected under 35 U.S.C. 102(e) as being
anticipated by Pugh et al. (U.S. patent No. 6,658,423 B1).

As to claim 10, Pugh et al. teaches a method for processing data representing documents,
comprising: 

for individual ones of documents, executing a software program to obtain a list of terms 
found in each document (see column 11, lines 1-60); 

computing a document signature for each document from the list of terms obtained for 
the document (see column 11, line 61 through column 13, line 67, where the number of lists is 
one); 

comparing the document signature for a first document to the document signature for a 
second document (see column 14, lines 1-42); and 

declaring the first document to be substantially identical to the second document if the 
document signatures are substantially equal (see column 14, lines 25-35). 

As to claims 12 and 27, Pugh et al. teaches wherein the documents are obtained in 
response to a search query made to a data communications network, and where the steps of 
comparing and declaring are executed in substantially real time as the documents are returned by 
the query (see column 19, lines 7-9). 

As to claims 13 and 28, Pugh et al. teaches wherein the documents are obtained in
response to a search query made to a data communications network, where the steps of 
comparing and declaring are executed in substantially real time as the documents are received 
from the data communications network, and for a case where a received document is found to be 
substantially identical to an already received document, returning only one of the documents in 
response to the search query (see column 19, lines 7-9). 

As to claim 25, Pugh et al. teaches a system for processing data representing documents, 
comprising, for individual documents of a set of documents, a processor for executing a software 
program to obtain a list of terms found in each document (see column 11, lines 1-60), for 
computing a document signature for each document from the list of terms obtained for the 
document (see column 11, line 61 through column 13, line 67, where the number of lists is one); 
for comparing the document signature for a first document to the document signature for a 
second document (see column 14, lines 1-42); and for declaring the first document to be 
substantially identical to the second document if the document signatures are equal (see column 
14, lines 25-35). 

As to claim 32, Pugh et al. teaches a computer program recorded on a computer-readable 
media, said computer program comprising instructions for directing a data processor to process 
data representing documents by, for individual ones of documents, obtaining a list of terms found 
in each document (see column 11, lines 1-60); computing a document signature for each 
document from the list of terms obtained for the document (see column 11, line 61 through 
column 13, line 67, where the number of lists is one); comparing the document signature for a
first document to the document signature for a second document (see column 14, lines 1-42); and 
declaring the first document to be substantially identical to the second document if the document 
signatures are equal (see column 14, lines 25-35). 

Claim Rejections - 35 USC §103 

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

8. Claims 4-6 and 19-21 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Gomes et al. (U.S. patent No. 6,615,209 B1) in view of Kathrow et al. (U.S. patent No.
6,263,348 B1).

As to claims 4 and 19, Gomes et al. does not teach further comprising storing the lists of 
terms in a database. 

Kathrow et al. teaches a way of identifying the existence of differences between two files
(see abstract), in which he teaches further comprising storing the lists of terms in a database (see 
column 5, lines 26-43). 

Therefore, it would have been obvious to a person having ordinary skill in the art at the
time the invention was made to have modified Gomes et al. to include further comprising storing
the lists of terms in a database.

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Gomes et al. by the teachings of Kathrow et al. because 
further comprising storing the lists of terms in a database would allow the document comparison 
to operate periodically (see Kathrow et al., column 5, lines 29-35).

As to claims 5 and 20, Gomes et al. does not teach further comprising computing a 
signature for each document, and storing the computed document signature. 

Kathrow et al. teaches further comprising computing a signature for each document (see 
column 5, lines 11-25), and storing the computed document signature (see column 5, lines
26-43).

Therefore, it would have been obvious to a person having ordinary skill in the art at the 
time the invention was made to have modified Gomes et al. to include further comprising 
computing a signature for each document, and storing the computed document signature. 

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Gomes et al. by the teachings of Kathrow et al. because 
further comprising computing a signature for each document, and storing the computed 
document signature would allow both similar and identical files to be found at any periodic time 
(see Kathrow et al., abstract, and see column 5, lines 29-35).

As to claims 6 and 21, Gomes et al. does not teach further comprising computing a
signature for each document, and storing the computed document signature in association with 
the list of terms for each document. 

Kathrow et al. teaches further comprising computing a signature for each document (see 
column 5, lines 11-25), and storing the computed document signature in association with the list
of terms for each document (see column 5, lines 26-43). 

Therefore, it would have been obvious to a person having ordinary skill in the art at the 
time the invention was made to have modified Gomes et al. to include further comprising 
computing a signature for each document, and storing the computed document signature in 
association with the list of terms for each document. 

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Gomes et al. by the teachings of Kathrow et al. because 
further comprising computing a signature for each document, and storing the computed 
document signature in association with the list of terms for each document would allow both 
similar and identical files to be found at any periodic time (see Kathrow et al., abstract, and see
column 5, lines 29-35). 



9. Claims 11, 26, and 33 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Pugh et al. (U.S. patent No. 6,658,423 B1) in view of Piosenka et al. (U.S. patent No. 4,993,068).

As to claims 11, 26, and 33, Pugh et al. does not teach wherein the step of computing a
document signature computes a hash code for each term of the list of terms, and then sums all of 
the hash codes to form the document signature. 

Piosenka et al. teaches a person identifying system wherein he teaches a way of signing a 
document in order to be able to detect tampering to it (see abstract), in which he teaches wherein 
the step of computing a document signature computes a hash code for each term of the list of 
terms, and then sums all of the hash codes to form the document signature (see column 7, lines
7-30).

Therefore, it would have been obvious to a person having ordinary skill in the art at the 
time the invention was made to have modified Pugh et al. to include wherein the step of 
computing a document signature computes a hash code for each term of the list of terms, and 
then sums all of the hash codes to form the document signature. 

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Pugh et al. by the teachings of Piosenka et al. because 
wherein the step of computing a document signature computes a hash code for each term of the 
list of terms, and then sums all of the hash codes to form the document signature would result in 
high probability that digital signatures of modified blocks would differ from the signatures of 
blocks that are the same (see Piosenka et al., column 7, lines 7-30).
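
For illustration only, the limitation recited in claims 11, 26, and 33 (a hash code computed for each term of the list of terms, with all of the hash codes summed to form the document signature) can be sketched in Python as follows. This is a hedged sketch and does not reproduce the method of Pugh et al., Piosenka et al., or the application: the use of SHA-256, the truncation to 64 bits, the modular sum, and the function names are all assumptions chosen for the example.

```python
import hashlib


def term_hash(term):
    """Hash code for one term.

    Assumption: the first 8 bytes of the SHA-256 digest, read as an
    unsigned integer. Any deterministic hash would serve the sketch.
    """
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def document_signature(terms):
    """Sum the hash codes of the distinct terms, wrapped to 64 bits.

    Assumption: repeated terms are counted once, so the input is
    treated as a list of distinct terms.
    """
    return sum(term_hash(t) for t in set(terms)) % (1 << 64)
```

Because addition is commutative, a signature formed this way does not depend on the order in which the terms are listed, so two documents with the same term list yield equal signatures regardless of word order.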



10. Claims 14-15 and 29-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Pugh et al. (U.S. patent No. 6,658,423 B1) in view of Kathrow et al. (U.S. patent No. 6,263,348
B1).

As to claims 14 and 29, Pugh et al. does not teach further comprising storing the
computed document signatures in a database. 

Kathrow et al. teaches further comprising storing the computed document signatures in a 
database (see column 5, lines 26-43). 

Therefore, it would have been obvious to a person having ordinary skill in the art at the 
time the invention was made to have modified Pugh et al. to include further comprising storing 
the computed document signatures in a database. 

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Pugh et al. by the teachings of Kathrow et al. because 
further comprising storing the computed document signatures in a database would allow the 
current invention to operate periodically (see Kathrow et al., column 5, lines 29-35).

As to claim 15, Pugh et al. does not teach further comprising storing the computed 
document signature in association with the list of terms for each document. 

Kathrow et al. teaches further comprising storing the computed document signature in 
association with the list of terms for each document (see column 5, lines 26-43). 

Therefore, it would have been obvious to a person having ordinary skill in the art at the 
time the invention was made to have modified Pugh et al. to include further comprising storing 
the computed document signature in association with the list of terms for each document. 

It would have been obvious to a person having ordinary skill in the art at the time the 
invention was made to have modified Pugh et al. by the teachings of Kathrow et al. because 
further comprising storing the computed document signature in association with the list of terms
for each document would allow the current invention to operate periodically (see Kathrow et al.,
column 5, lines 29-35).

As to claim 30, Pugh et al. as modified, teaches further comprising storing the computed 
document signature in association with the list of terms for each document (see Kathrow et al.,
column 5, lines 26-43). 

Allowable Subject Matter 

11. Claims 8-9 and 23-24 are objected to as being dependent upon a rejected base claim, but 
would be allowable if rewritten in independent form including all of the limitations of the base 
claim and any intervening claims. 

12. The following is a statement of reasons for the indication of allowable subject matter: 

The prior art of record, Gomes et al. (U.S. patent No. 6,615,209 B1), Pugh et al. (U.S.
patent No. 6,658,423 B1), Kathrow et al. (U.S. patent No. 6,263,348 B1), and Piosenka et al.
(U.S. patent No. 4,993,068), do not disclose, teach, or suggest the claimed limitations of (in 
combination with all other features in the claim): 

wherein the step of comparing includes a preliminary step of sorting the documents into a 
document list in order of increasing size, and where the step of comparing compares a given 
document with the next larger documents in the document list, as claimed in claims 8 and 23. 

The prior art of record, Gomes et al. (U.S. patent No. 6,615,209 B1), Pugh et al. (U.S.
patent No. 6,658,423 B1), Kathrow et al. (U.S. patent No. 6,263,348 B1), and Piosenka et al.
(U.S. patent No. 4,993,068), do not disclose, teach, or suggest the claimed limitations of (in 
combination with all other features in the claim): 

wherein the step of comparing includes a preliminary step of sorting the documents into a 
document list in order of increasing size, and where the step of comparing compares a given 
document only with another document in the list that is no more than a predetermined amount 
larger than the given document, as claimed in claims 9 and 24. 
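
For illustration only, the limitation recited in claims 9 and 24 (sorting the documents in order of increasing size, then comparing a given document only with documents no more than a predetermined amount larger) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not drawn from the application or the prior art of record: the function name, the dictionary of sizes, and the `max_diff` parameter are hypothetical.

```python
def candidate_pairs(sizes, max_diff):
    """Return the document pairs that would be compared.

    sizes:    dict mapping a document id to its size (hypothetical input).
    max_diff: the predetermined amount by which the larger document
              may exceed the given document.
    """
    # Preliminary step: sort the documents in order of increasing size.
    ordered = sorted(sizes.items(), key=lambda item: item[1])
    pairs = []
    for i, (doc_id, size) in enumerate(ordered):
        # Walk only the larger documents; stop at the first one that
        # is too large, since all later ones are larger still.
        for other_id, other_size in ordered[i + 1:]:
            if other_size - size > max_diff:
                break
            pairs.append((doc_id, other_id))
    return pairs
```

Sorting first lets the inner loop stop early, which is why the number of comparisons can fall well below the number needed to compare every document with every other document.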

Conclusion 

13. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Jacob F. Betit whose telephone number is (703) 305-3735. The 
examiner can normally be reached on Monday through Friday 9 am to 5 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Dov Popovici can be reached on (703) 305-3830. The fax phone number for the 
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



jfb 



9 Jul 2004 